A Bayesian Model of Syntax-Directed Tree to String Grammar Induction
نویسندگان
چکیده
Tree based translation models are a compelling means of integrating linguistic information into machine translation. Syntax can inform lexical selection and reordering choices and thereby improve translation quality. Research to date has focussed primarily on decoding with such models, but less on the difficult problem of inducing the bilingual grammar from data. We propose a generative Bayesian model of tree-to-string translation which induces grammars that are both smaller and produce better translations than the previous heuristic two-stage approach which employs a separate word alignment step.
منابع مشابه
یک مدل بیزی برای استخراج باناظر گرامر زبان طبیعی
In this paper, we show that the problem of grammar induction could be modeled as a combination of several model selection problems. We use the infinite generalization of a Bayesian model of cognition to solve each model selection problem in our grammar induction model. This Bayesian model is capable of solving model selection problems, consistent with human cognition. We also show that using th...
متن کاملFine-Grained Tree-to-String Translation Rule Extraction
Tree-to-string translation rules are widely used in linguistically syntax-based statistical machine translation systems. In this paper, we propose to use deep syntactic information for obtaining fine-grained translation rules. A head-driven phrase structure grammar (HPSG) parser is used to obtain the deep syntactic information, which includes a fine-grained description of the syntactic property...
متن کاملInsertion Operator for Bayesian Tree Substitution Grammars
We propose a model that incorporates an insertion operator in Bayesian tree substitution grammars (BTSG). Tree insertion is helpful for modeling syntax patterns accurately with fewer grammar rules than BTSG. The experimental parsing results show that our model outperforms a standard PCFG and BTSG for a small dataset. For a large dataset, our model obtains comparable results to BTSG, making the ...
متن کاملPoverty of the Stimulus? A Rational Approach
The Poverty of the Stimulus (PoS) argument holds that children do not receive enough evidence to infer the existence of core aspects of language, such as the dependence of linguistic rules on hierarchical phrase structure. We reevaluate one version of this argument with a Bayesian model of grammar induction, and show that a rational learner without any initial language-specific biases could lea...
متن کاملHierarchical Chunk-to-String Translation
We present a hierarchical chunk-to-string translation model, which can be seen as a compromise between the hierarchical phrasebased model and the tree-to-string model, to combine the merits of the two models. With the help of shallow parsing, our model learns rules consisting of words and chunks and meanwhile introduce syntax cohesion. Under the weighed synchronous context-free grammar defined ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009